Archimedes Optimization Algorithm-Based Feature Selection with Hybrid Deep-Learning-Based Churn Prediction in Telecom Industries

Customer churn prediction (CCP) refers to the use of data analytics and machine learning (ML) tools to forecast churning customers, i.e., customers who are likely to cancel their subscriptions, so that companies can apply targeted retention approaches and reduce the customer attrition rate. This predictive methodology supports proactive customer management, improves customer satisfaction, and sustains business profits. By recognizing and prioritizing relevant features, such as usage patterns and customer interactions, and by leveraging deep learning (DL) algorithms, telecom companies can develop robust predictive models that anticipate customer churn and strengthen retention approaches. Against this background, the current study presents the Archimedes optimization algorithm-based feature selection with a hybrid deep-learning-based churn prediction (AOAFS-HDLCP) technique for telecom companies. To mitigate high-dimensionality problems, the AOAFS-HDLCP technique uses the AOAFS approach to choose an optimal set of features. The convolutional neural network with autoencoder (CNN-AE) model is then used for the churn prediction process. Finally, the thermal equilibrium optimization (TEO) technique is employed for hyperparameter selection of the CNN-AE algorithm, which, in turn, helps in achieving improved classification performance. An extensive experimental analysis was conducted to illustrate the enhanced performance of the AOAFS-HDLCP algorithm. The experimental outcomes show that the AOAFS-HDLCP approach outperforms other techniques, with a maximum accuracy of 94.65%.


Introduction
Telecommunications has become one of the largest industries in developed countries. Technological developments and a large number of operators increase the range of challenges encountered by the industry [1]. Companies are actively working to survive in this competitive market, for which several approaches are being followed [2]. In order to generate high revenues, three key policies are followed: gaining new customers, upselling to the existing customers, and increasing the retention time of the customers. Comparing these policies and taking the return on investment (RoI) of each into account, it can be inferred that the third policy is the most profitable approach [3], since retaining a present customer costs considerably less than gaining a new one. Further, it is also regarded as a simpler task than the upselling plan. In order to implement the third policy, companies need to reduce the likelihood of customer churn [4]. Predicting the customers who are likely to leave the network helps in retaining them and, thus, indicates a possibly massive increase in profit if it is implemented in the early phase [5]. Various studies have established that ML techniques are extremely effective in predicting churning customers. This approach is implemented based on the knowledge gained from prior data [6].
Big data tasks can be performed easily with the help of artificial intelligence (AI) technology without much effort from the sales and customer support teams [7]. So, it is crucial to incorporate AI in financial activities that include social marketing, sales, customer relationship management (CRM), and so on, to effectively attract customers and gain their trust. Since AI is a significant part of social networks and other electronic marketing sites, it is crucial to understand how to utilize, change, and operate these sites in an efficient manner [8]. Customer behavior analysis strongly affects the social networking and other marketing actions of a company by permitting highly customized and predictive marketing activities. By analyzing customer data, companies gain insight into what resonates with their audience [9]. Businesses employ such data to engage in highly efficient social media and marketing activities, which can in turn result in greater customer support and conversion rates. Also, deep learning (DL) techniques can support companies in optimizing and automating their promotional activities, thus saving resources and time while enhancing the firm's overall effectiveness. Recently, metaheuristic algorithms [10] have been widely used for hyperparameter tuning of DL models. A few such metaheuristics include monarch butterfly optimization (MBO) [11], slime mold algorithm (SMA) [12], moth search algorithm (MSA) [13], hunger games search (HGS) [14], Runge Kutta method (RUN) [15], colony predation algorithm (CPA) [16], weighted mean of vectors (INFO) [17], Harris hawks optimization (HHO) [18], and rime optimization algorithm (RIME) [19].
Against this background, the current study introduces the Archimedes optimization algorithm-based feature selection with hybrid deep-learning-based churn prediction (AOAFS-HDLCP) technique for telecom companies. The objective of the proposed AOAFS-HDLCP method is to predict churning customers so as to support customer retention activities in the telecom industry. In the presented AOAFS-HDLCP technique, the AOAFS approach is intended to choose an optimal set of features. It offers a fast convergence rate and a fine balance between local and global search capacity when solving continuous problems. The current study uses the convolutional neural network with an autoencoder (CNN-AE) model for churn prediction. Further, the thermal equilibrium optimization (TEO) technique is applied for hyperparameter tuning to boost the outcomes of the CNN-AE model. An extensive experimental analysis was conducted to illustrate the enhanced performance of the AOAFS-HDLCP method. Briefly, the major contributions of this research are given below:

• An intelligent AOAFS-HDLCP method including AOAFS, CNN-AE classification, and TEO-based hyperparameter tuning is introduced for churn prediction. To the best of the authors' knowledge, the AOAFS-HDLCP method has not been reported in the literature.

• The AOAFS method is designed to detect the essential attributes from the telecom industry's complex datasets, thus enhancing the efficiency and effectiveness of the churn prediction process.

• The CNN-AE model is employed for the churn prediction process. It can capture intricate patterns and relationships in the data, thus potentially improving the accuracy of churn prediction compared with traditional approaches.

• A TEO technique is used to fine-tune the hyperparameters of the CNN-AE model in an effective manner so as to optimize its performance in predicting customer churn.

Related Works
The authors in the literature [20] introduced the AI with Jaya optimization algorithm (JOA)-based churn prediction for data exploration (AIJOA-CPDE) method. In this algorithm, a primary step of feature selection was introduced by employing the JOA approach for the selection of feature sets. The proposed system utilized a bidirectional LSTM (BLSTM) algorithm for churn prediction. Finally, the chicken swarm optimization (CSO) method was applied in this study for hyperparameter optimization. Kozak et al. [21] considered customer churn management to validate the efficiency of swarm intelligence machine learning (SIML) techniques. The aim of this study was twofold: to examine the existence of particular features and objectives in customer churn management, and to validate whether the adapted SIML technique increased the efficiency of churn-related segmentation and decision-making. Saha et al. [22] studied ensemble learning approaches, namely, xgboost (XGB), bagging and stacking, Adaboost, gradient boosting (GBM), extremely randomized tree (ERT), and random forest (RF); standard classification algorithms, such as LR, ANN, DT, and KNN; and the DL-CNN approach, in order to select the best method for developing the CCP technique.
In the literature [23], the authors developed the dynamic customer churn prediction (CCP) method for business intelligence by applying text analytics with a metaheuristic optimizer (CCPBI-TAMO). Additionally, the LSTM with stacked AE (LSTM-SAE) algorithm was implemented for the classification of the feature-reduced data. Faritha Banu et al. [24] suggested the AI-based CCP for Telecommunication Business Markets (AICCP-TBM) method, in which the chaotic SSO-based FS (CSSO-FS) algorithm was utilized for selecting the superior feature set. Additionally, the fuzzy-rule-based classifier (FRC) was exploited for differentiating the non-churn customers from the churners. The quantum behaved particle swarm optimization (QPSO) approach was applied in this study to select the membership functions for the FRC algorithm.
In the study conducted earlier [25], the stacked bidirectional LSTM (SBLSTM) and RNN models were developed with AOA for CCP. The aim of the presented approach was to forecast customer churn in an insurance company. Primarily, the AOA approach conducted the preprocessing of the data to change the raw data into a usable format. Moreover, the SBLSTM-RNN algorithm was utilized in this study for distinguishing the churn and non-churn customers. In the literature [26], the authors created an ML approach that can effectively forecast churn for telecom companies. The outcomes can be used in an appropriate manner, i.e., to apply marketing retention approaches that retain the customers over time. In this method, the authors employed recent databases, made use of preprocessing steps such as bivariate and univariate analyses, and employed data visualization methods to understand the database correctly. Alshamari [27] intended to analyze and measure user approval of the services rendered by the Saudi Telecom Company (STC), Mobily, and Zain. This kind of sentiment analysis (SA) has been a dominant parameter and has been utilized to make significant business decisions for enhancing the satisfaction as well as the loyalty of the customers. In this case, the author established new approaches based on DL techniques for analyzing the percentage of customer satisfaction using the openly accessible database, i.e., AraCust.
The existing literature on CCP has made significant strides in leveraging both ML and DL techniques to identify the potential churners. However, a notable research gap persists in adequately addressing the critical aspects of feature selection and hyperparameter tuning within this context. Though comprehensive studies have been conducted earlier on individual aspects of CCP, the simultaneous consideration of feature selection and hyperparameter tuning remains an underexplored territory. Feature selection plays an important role in improving the efficacy of the model by detecting the most informative variables, thus reducing both noise and computation. At the same time, hyperparameter tuning is crucial for fine-tuning the model's performance and generalization. The synergy between these two crucial aspects can potentially yield highly efficient and accurate churn prediction methods. However, the existing research often overlooks this synergy, thus resulting in suboptimal predictive abilities. Bridging this research gap is vital to unlock the maximum potential of CCP algorithms. This can further offer businesses highly efficient mechanisms for customer retention and improved decision-making processes in extremely competitive industries.

The Proposed Model
In this article, the AOAFS-HDLCP system has been proposed for churn prediction in the telecom industry. The objective of the AOAFS-HDLCP method is to predict churn so as to increase customer retention in the telecom industry. The presented AOAFS-HDLCP technique combines the AOAFS approach, CNN-AE classification, and TEO-based hyperparameter tuning. Figure 1 exhibits the working procedure of the AOAFS-HDLCP approach.

Stage I: Feature Selection Using AOA
In this study, the AOA is designed to choose the optimum feature set. The fundamental idea of AOA is based on Archimedes' physical law of buoyancy [28]. AOA is an effective model for the optimization process since it can balance the tradeoff between the exploration and exploitation phases, thus making it suitable for managing difficult and multidimensional search spaces. Inspired by Archimedes' principle of buoyancy, the AOA method adapts its searching mechanism to the fitness landscape, thus enabling effective convergence towards the optimal solution. It is highly adaptable, able to escape local minima, and well suited for addressing real-world problems across various domains. Since the feature selection process must identify highly relevant features, the AOA's adaptability and capacity to discern informative features from a multitude of possibilities prove to be invaluable. With dynamic adjustment of the searching process based on the dataset characteristics, the AOA performs well in the detection of optimum feature subsets. This results in improved model interpretability, reduced computational complexity, and improved generalization performance.
AOA is a new metaheuristic algorithm derived from Archimedes' principle. Similar to other population-based metaheuristic techniques, the AOA technique begins its search with an initial population whose objects have random volume, density, and acceleration. The steps followed in the AOA method are listed below.
Step 1. Initialize the population location, volume, density, and acceleration using Equation (1), written here in the standard AOA form [28]:

$$X_i = lb_i + rand(N, D) \times (ub_i - lb_i), \quad den_i = rand(N, D), \quad vol_i = rand(N, D), \quad acc_i = lb_i + rand(N, D) \times (ub_i - lb_i) \tag{1}$$

where the population number and the dimension of the search range are N and D, respectively. The i-th object in the population of N objects is $X_i$. The lower and upper limits of the search range are $lb_i$ and $ub_i$, respectively. rand(N, D) denotes an N × D matrix of random numbers generated by the system function. The volume, density, and acceleration of the i-th object are $vol_i$, $den_i$, and $acc_i$, correspondingly. Next, the individual $X_{best}$ with the optimum fitness value and the respective $acc_{best}$, $den_{best}$, and $vol_{best}$ are chosen [29].
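Step 2. Update the density and volume of the i-th object for the (t + 1)-th iteration as given below, following the standard AOA form [28]:

$$den_i^{t+1} = den_i^t + rand \times (den_{best} - den_i^t), \quad vol_i^{t+1} = vol_i^t + rand \times (vol_{best} - vol_i^t) \tag{2}$$

In Equation (2), the global optimum values of density and volume are denoted by $den_{best}$ and $vol_{best}$, correspondingly.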
Step 3. Compute the density decline factor d and the transfer parameter TF, which together balance the global and local convergence capability of the AOA method. In the standard AOA form [28], these are:

$$TF = \exp\left(\frac{t - t_{max}}{t_{max}}\right) \tag{3}$$

$$d^{t+1} = \exp\left(\frac{t_{max} - t}{t_{max}}\right) - \frac{t}{t_{max}} \tag{4}$$

In Equation (3), the maximum and the current iterations are denoted by $t_{max}$ and t, respectively. Here, TF rises with the iteration number until TF = 1. In Equation (4), as the iteration number increases, d reduces and the search is confined to the promising region that has already been detected [30].
Step 4. When TF ≤ 0.5, exploration takes place and the objects collide. The acceleration is then updated as follows (standard AOA form [28]):

$$acc_i^{t+1} = \frac{den_{mr} + vol_{mr} \times acc_{mr}}{den_i^{t+1} \times vol_i^{t+1}} \tag{5}$$

In Equation (5), the acceleration, volume, and density of the i-th individual at the (t + 1)-th iteration are denoted by $acc_i^{t+1}$, $vol_i^{t+1}$, and $den_i^{t+1}$, correspondingly. The acceleration, density, and volume of a randomly chosen individual are denoted by $acc_{mr}$, $den_{mr}$, and $vol_{mr}$, correspondingly. When TF > 0.5, the exploitation stage takes place and there is no collision between the objects. So, the acceleration is updated as given below.

$$acc_i^{t+1} = \frac{den_{best} + vol_{best} \times acc_{best}}{den_i^{t+1} \times vol_i^{t+1}} \tag{6}$$

Next, the acceleration is normalized using the following equation.

$$acc_{i,norm}^{t+1} = u \times \frac{acc_i^{t+1} - \min(acc)}{\max(acc) - \min(acc)} + l \tag{7}$$

In Equation (7), u and l define the range of normalization and are fixed at 0.9 and 0.1, correspondingly. $acc_{i,norm}^{t+1}$ determines the step percentage by which each agent changes. When the object i is far from the global optimum, the value of $acc_{i,norm}^{t+1}$ is higher, which implies that the object is in the exploration stage.
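Step 5. When TF ≤ 0.5, the location of each object in the population X is updated using the equation below (standard AOA form [28]).

$$X_i^{t+1} = X_i^t + C_1 \times rand \times acc_{i,norm}^{t+1} \times d \times (X_{rand} - X_i^t) \tag{8}$$

In Equation (8), $C_1$ is a constant equal to 2 and $X_{rand}$ is a randomly chosen object. Otherwise, when TF > 0.5, the location is updated using Equation (9):

$$X_i^{t+1} = X_{best}^t + F \times C_2 \times rand \times acc_{i,norm}^{t+1} \times d \times (T \times X_{best} - X_i^t) \tag{9}$$

Here, $C_2$ is a constant equal to 6, and $T = C_3 \times TF$, so T rises with time. The parameter F changes the direction of the movement and is evaluated by Equation (10):

$$F = \begin{cases} +1, & P \le 0.5 \\ -1, & P > 0.5 \end{cases}, \quad P = 2 \times rand - C_4 \tag{10}$$

where $C_3$ and $C_4$ balance the direction of the movements and thereby adjust the capability of the model to escape local optima.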
Step 6. Evaluation. Based on the updated population, the individual with the optimal fitness and its acceleration, density, and volume are selected. The procedure is repeated until the maximum number of iterations is reached [31].
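Steps 1 to 6 can be sketched in a few lines of Python. The block below is only an illustration under stated assumptions: it uses the commonly published AOA update rules, minimizes a generic continuous cost function rather than the feature-selection fitness, clips positions to the bounds, and accepts a new position only when it improves the object's fitness. The function name `aoa_minimize`, its parameters, and the default values of `c3` and `c4` are hypothetical choices, not taken from the study.

```python
import math
import random


def aoa_minimize(cost, lb, ub, n_objects=20, t_max=100,
                 c1=2.0, c2=6.0, c3=2.0, c4=0.5, seed=0):
    """Sketch of the AOA loop (Steps 1-6) for minimizing cost over [lb, ub]."""
    rng = random.Random(seed)
    dim = len(lb)
    u, l = 0.9, 0.1  # normalization range of the acceleration

    def rand_vec():
        return [lb[j] + rng.random() * (ub[j] - lb[j]) for j in range(dim)]

    # Step 1: initialize positions, densities, volumes, and accelerations.
    X = [rand_vec() for _ in range(n_objects)]
    den = [[rng.random() for _ in range(dim)] for _ in range(n_objects)]
    vol = [[rng.random() for _ in range(dim)] for _ in range(n_objects)]
    acc = [rand_vec() for _ in range(n_objects)]
    fit = [cost(x) for x in X]
    b = min(range(n_objects), key=lambda i: fit[i])
    x_best, f_best = X[b][:], fit[b]
    den_best, vol_best, acc_best = den[b][:], vol[b][:], acc[b][:]
    history = [f_best]

    for t in range(1, t_max + 1):
        # Step 3: transfer operator TF and density decline factor d.
        tf = math.exp((t - t_max) / t_max)
        d = math.exp((t_max - t) / t_max) - t / t_max
        old_acc = [row[:] for row in acc]
        for i in range(n_objects):
            mr = rng.randrange(n_objects)  # random object for the collision
            for j in range(dim):
                # Step 2: pull density and volume towards the best object.
                den[i][j] += rng.random() * (den_best[j] - den[i][j])
                vol[i][j] += rng.random() * (vol_best[j] - vol[i][j])
                # Step 4: exploration (collision) or exploitation.
                if tf <= 0.5:
                    num = den[mr][j] + vol[mr][j] * old_acc[mr][j]
                else:
                    num = den_best[j] + vol_best[j] * acc_best[j]
                acc[i][j] = num / (den[i][j] * vol[i][j] + 1e-12)
        # Step 4 (continued): normalize accelerations into [l, u + l].
        flat = [a for row in acc for a in row]
        a_min, a_max = min(flat), max(flat)
        span = (a_max - a_min) or 1.0
        for i in range(n_objects):
            # Step 5: position update.
            new_x = []
            for j in range(dim):
                a_norm = u * (acc[i][j] - a_min) / span + l
                if tf <= 0.5:
                    r = rng.randrange(n_objects)
                    xj = X[i][j] + c1 * rng.random() * a_norm * d * (X[r][j] - X[i][j])
                else:
                    flag = 1.0 if 2 * rng.random() - c4 <= 0.5 else -1.0
                    big_t = min(c3 * tf, 1.0)
                    xj = x_best[j] + flag * c2 * rng.random() * a_norm * d * (
                        big_t * x_best[j] - X[i][j])
                new_x.append(min(max(xj, lb[j]), ub[j]))  # keep inside bounds
            # Step 6: greedy acceptance (assumption), then refresh the best.
            f_new = cost(new_x)
            if f_new < fit[i]:
                X[i], fit[i] = new_x, f_new
            if fit[i] < f_best:
                x_best, f_best = X[i][:], fit[i]
                den_best, vol_best, acc_best = den[i][:], vol[i][:], acc[i][:]
        history.append(f_best)
    return x_best, f_best, history
```

For example, `aoa_minimize(lambda v: sum(a * a for a in v), [-5.0] * 3, [5.0] * 3)` searches for the minimum of a three-dimensional sphere function. For feature selection, the cost would instead be a fitness evaluated on a binarized position vector.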
The fitness function (FF) of the AOA-FS technique considers both the classification outcomes and the number of features selected. It rewards a smaller set of selected features and better classification outcomes. Hence, the FF used for evaluating the individual solutions is the weighted sum

$$Fitness = \alpha \times ErrorRate + (1 - \alpha) \times \frac{\#SF}{\#All\_F} \tag{11}$$

In Equation (11), ErrorRate is the classifier error rate based on the selected features. ErrorRate is estimated as the ratio of incorrect classifications to the total number of classifications made, and lies in the range [0, 1]. #SF is the number of features selected and #All_F denotes the total number of features in the original dataset. α controls the relative importance of classification quality and subset length. α is fixed at 0.9 in the current study.
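As a small worked illustration of this fitness, the sketch below assumes the usual weighted-sum form, α × ErrorRate + (1 − α) × #SF / #All_F, with α = 0.9 as stated above. The helper `decode_mask`, which thresholds a continuous AOA position at 0.5 to obtain a binary feature mask, is a hypothetical binarization step and is not described in the text.

```python
def fs_fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Weighted sum of classifier error and selected-feature ratio (lower is better)."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total


def decode_mask(position, threshold=0.5):
    """Assumed binarization: a feature is kept when its coordinate exceeds the threshold."""
    return [1 if p > threshold else 0 for p in position]


# Hypothetical candidate: 10% error using 10 of 20 features.
mask = decode_mask([0.9, 0.1] * 10)
score = fs_fitness(0.10, sum(mask), len(mask))  # 0.9 * 0.10 + 0.1 * 0.5 = 0.14
```

With α = 0.9, a one-point drop in error rate outweighs removing a single feature from a 20-feature set, so the search prefers accuracy first and compactness second.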

Stage II: Churn Prediction Using CNN-AE Model
The CNN-AE model is used for churn prediction. The CNN model is a kind of DL method and is one of the state-of-the-art techniques for computer vision (CV) applications, owing to its considerable benefits [32]. The primary benefit of the CNN technique is feature learning, i.e., it can extract and learn relevant features. Due to its deep architecture, the CNN technique also learns from abundant datasets. Feature extraction is a main and challenging problem in pattern prediction. The features are highly essential since they represent the properties of the input. CNN is a DL approach in which the feature extraction layers are self-learned. A component of the encoded vector does not necessarily encode a single feature. The decoding network has a large number of parameters, and combinations of components can encode and reconstruct a vast number of features. Thus, the CNN-AE technique is used to implement unsupervised learning for dimension reduction and feature extraction. Distances between vectors are much faster to compute because the features are projected into a low-dimensional space. Figure 2 demonstrates the architecture of the CNN-AE model.
The convolutional autoencoder (CAE) has a similar structure to CNN, comprising convolutional filters and pooling layers. The only difference between CNN and CAE is that the input and output nodes have equal dimensions in CAE. The reconstructed data are compared with the input data, so the learning method does not rely on a labeled dataset. The CNN-AE is thus an unsupervised learning method, while CNN is a DL method with multiple convolutional layers that is primarily exploited for feature extraction and image processing tasks [33]. CAE uses a convolution operator to encode the input features and replicate them at the output with minimal reconstruction error. CAE consists of m feature maps in the output layer and m convolution kernels. The input feature maps are generated from the input layer, and n corresponds to the number of input channels. The hidden representation of the k-th feature map in the encoder is described using Equation (12), written here in the standard CAE form [34]:

$$h^k = \sigma(x * W^k + b^k) \tag{12}$$

where x is the input, $W^k$ and $b^k$ are the kernel and bias of the k-th feature map, σ denotes the activation function, and * indicates the 2D convolution. In the decoder, the reconstruction is described as

$$y = \sigma\Big(\sum_{k \in H} h^k * \tilde{W}^k + c\Big)$$

where H is the set of hidden feature maps, $\tilde{W}^k$ is $W^k$ flipped in both dimensions, and c denotes the bias per input channel [34].
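The encoder and decoder described above can be illustrated with a forward pass written in plain Python. This is a sketch under assumptions: it uses a single input channel, a sigmoid activation, a "valid" convolution in the encoder, and a "full" convolution with flipped kernels in the decoder, so that the reconstruction has the same size as the input. The weights are random and untrained, and the names `conv2d` and `cae_forward` are illustrative only; a practical model would be built and trained in a DL framework.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def conv2d(x, w, mode):
    """2D convolution of matrix x with kernel w; mode is 'valid' or 'full'."""
    kh, kw = len(w), len(w[0])
    if mode == "full":  # zero-pad so the output grows by (kernel size - 1)
        pad_h, pad_w = kh - 1, kw - 1
        width = len(x[0]) + 2 * pad_w
        x = ([[0.0] * width for _ in range(pad_h)]
             + [[0.0] * pad_w + list(r) + [0.0] * pad_w for r in x]
             + [[0.0] * width for _ in range(pad_h)])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    # A true convolution flips the kernel in both dimensions.
    return [[sum(x[i + a][j + b] * w[kh - 1 - a][kw - 1 - b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def cae_forward(x, kernels, biases, c):
    """Encode x into one hidden map per kernel, then reconstruct the input."""
    # Encoder: hidden map k = sigmoid(x * W_k + b_k).
    hidden = [[[sigmoid(v + b) for v in row] for row in conv2d(x, w, "valid")]
              for w, b in zip(kernels, biases)]
    rows, cols = len(x), len(x[0])
    total = [[0.0] * cols for _ in range(rows)]
    # Decoder: sum over hidden maps convolved with the flipped kernels.
    for h, w in zip(hidden, kernels):
        flipped = [r[::-1] for r in w[::-1]]
        rec = conv2d(h, flipped, "full")
        for i in range(rows):
            for j in range(cols):
                total[i][j] += rec[i][j]
    y = [[sigmoid(v + c) for v in row] for row in total]
    return hidden, y


rng = random.Random(0)
x = [[rng.random() for _ in range(6)] for _ in range(5)]           # 5 x 6 input
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
           for _ in range(2)]                                      # two 3 x 3 kernels
hidden, y = cae_forward(x, kernels, [0.0, 0.1], 0.0)
```

Here each hidden map is 3 × 4 and the reconstruction `y` is 5 × 6, matching the input. Training would adjust the kernels and biases to minimize the reconstruction error between `x` and `y`.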

Stage III: Parameter Tuning Using the TEO Method
Finally, the TEO has been implemented in the current study for fine-tuning the hyperparameters of the CNN-AE architecture. Hyperparameter selection is critical for configuring the CNN-AE technique. Optimal hyperparameters considerably affect the effectiveness of the model as well as its capability to capture complex features and generalize. By implementing the TEO technique, the goal is to explore the hyperparameter space efficiently and enhance the capabilities of the CNN-AE in the context of CCP within the telecom industry. The TEO method is inspired by the principles of thermal equilibrium in physical systems, which enables a robust exploration of the hyperparameter space. It provides several benefits in the optimization process, mainly in hyperparameter tuning for DL models: it strikes an active balance between the exploration and exploitation phases, can navigate complex solution spaces, provides good convergence and solution quality, and can be combined with local and global search approaches. The versatility and efficiency of the TEO method make it a favorable choice for fine-tuning the hyperparameters of architectures such as CNN-AE, including for CCP in the telecom industry.
TEO is a novel optimization technique based on Newton's law of cooling, which states that the rate of heat loss of an object is directly proportional to the temperature difference between the object and its surrounding environment at a given point [35]. In the current research work, some search agents are represented as reference or recognized nodes (cooling objects), whereas the unrecognized (NLOS) nodes are represented as the environment. The heat exchange between the environment and the cooling objects is mathematically modelled as follows, written here in the standard TEO form [35]:

$$T_i^{x-env} = \left(1 - (cv_1 + cv_2 \times (1 - t)) \times rnd\right) \times T_i^{p-env}, \quad t = \frac{C_{IN}}{MaxIter}$$

where $T_i^{p-env}$ and $T_i^{x-env}$ represent the earlier and the modified temperatures of the environment objects, respectively, $cv_1$ and $cv_2$ are the control variables, and rnd is a uniform random number in [0, 1] [36]. Furthermore, $C_{IN}$ and MaxIter refer to the current and the maximum iteration counts. In addition to this, the initial phase of the TEO technique updates the temperature of each object with respect to its surrounding environment. In the standard TEO form [35], this update reads:

$$T_i^{new} = T_i^{x-env} + (T_i^{old} - T_i^{x-env}) \exp(-\beta t), \quad \beta = \frac{cost(object)}{cost(worst\ object)}$$
Next, the rnd value is compared with a predefined threshold to decide whether a single, randomly selected dimension of the i-th search agent is reset according to Equation (18), written here in the standard TEO form [35]:

$$T_{i,j} = T_{j,Min} + rnd \times (T_{j,Max} - T_{j,Min}) \tag{18}$$

In Equation (18), $T_{i,j}$ represents the j-th variable of the i-th search agent, and $T_{j,Min}$ and $T_{j,Max}$ indicate the lower and upper bounds of the j-th variable, respectively [37]. Fitness selection is an essential component of the TEO methodology. An encoded solution is used to evaluate each candidate solution, and the accuracy value is the foremost criterion applied for designing the FF, which is expressed in terms of the true positive (TP) and false positive (FP) values.
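The TEO search loop can be sketched as follows. This is a simplified illustration, not the configuration used in the study: it assumes the commonly published TEO update rules, a non-negative cost to be minimized (for hyperparameter tuning this could be, for instance, one minus the validation accuracy of a CNN-AE trained with the decoded hyperparameters), a pairing in which each agent takes the agent half a population away (after sorting by cost) as its environment, and no thermal memory. The function name `teo_minimize` and the parameter `pro` (the reset threshold) are hypothetical.

```python
import math
import random


def teo_minimize(cost, lb, ub, n_agents=10, max_iter=50,
                 c1=1.0, c2=1.0, pro=0.3, seed=0):
    """Simplified TEO loop for a non-negative cost (e.g. validation error)."""
    rng = random.Random(seed)
    dim = len(lb)
    half = n_agents // 2
    temps = [[lb[j] + rng.random() * (ub[j] - lb[j]) for j in range(dim)]
             for _ in range(n_agents)]
    costs = [cost(a) for a in temps]
    best = min(range(n_agents), key=lambda i: costs[i])
    best_x, best_f = temps[best][:], costs[best]
    history = [best_f]

    for it in range(1, max_iter + 1):
        t = it / max_iter
        # Sort agents so that good and poor agents can be paired.
        order = sorted(range(n_agents), key=lambda i: costs[i])
        temps = [temps[i] for i in order]
        costs = [costs[i] for i in order]
        worst = max(costs[-1], 1e-12)
        new_temps = []
        for i in range(n_agents):
            env = temps[(i + half) % n_agents]  # partner acting as environment
            beta = costs[i] / worst
            agent = []
            for j in range(dim):
                # Perturb the environment temperature, then cool towards it.
                env_j = (1.0 - (c1 + c2 * (1.0 - t)) * rng.random()) * env[j]
                val = env_j + (temps[i][j] - env_j) * math.exp(-beta * t)
                agent.append(min(max(val, lb[j]), ub[j]))
            if rng.random() < pro:
                # Reset one randomly chosen dimension to escape local optima.
                j = rng.randrange(dim)
                agent[j] = lb[j] + rng.random() * (ub[j] - lb[j])
            new_temps.append(agent)
        temps = new_temps
        costs = [cost(a) for a in temps]
        i_best = min(range(n_agents), key=lambda i: costs[i])
        if costs[i_best] < best_f:
            best_x, best_f = temps[i_best][:], costs[i_best]
        history.append(best_f)
    return best_x, best_f, history
```

In a tuning setting, each coordinate of an agent would be mapped to one hyperparameter (for example, a learning rate or a number of filters), and `cost` would train and validate the model once per evaluation, which is the expensive part of the procedure.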

Results and Discussion
The developed method was validated using Python 3.8.5 on a PC configured with an i5-8600k processor, GeForce 1050Ti 4 GB GPU, 16 GB RAM, 250 GB SSD, and 1 TB HDD. Diverse Python packages were used, namely opencv-python, numpy, matplotlib, tensorflow (GPU-CUDA enabled), keras, pickle, sklearn, and pillow. The CCP performance of the AOAFS-HDLCP technique was investigated using the customer churn prediction Telecom Churn Dataset [38], which includes 3,333 data instances with 21 attributes, as described in Table 1. The dataset was downloaded from the Kaggle repository. The measures used for examining the classification outcomes are accuracy ($accu_y$), precision ($prec_n$), recall ($reca_l$), and F-score ($F_{score}$).
Precision is used to measure the proportion of the predicted positive instances out of each instance that is predicted as positive.
Recall is used to measure the proportion of the positive samples classified.
Accuracy is used to measure the proportion of the classified samples (positive and negative) against the overall samples classified.
F-score combines the harmonic mean of prec n and reca l .The confusion matrices generated by the AOAFS-HDLCP method on 90:10 and 80:20 of the TRS/TSS datasets are demonstrated in Figure 3.The outcomes portray the effectual recognition of the proposed model in terms of churn and non-churn samples on all the class labels.The CCP outcomes of the AOAFS-HDLCP method under 90:10 and 80:20 of the TRS/TSS datasets are shown in Table 2.The simulation values demonstrate that the AOAFS-HDLCP method categorized the churn and non-churn samples effectively.With 90% TRS, the AOAFS-HDLCP model provided an average   of 93.58%,   of 96.63%,   of 93.58%,   of 95.03%, and an   of 93.58%.In addition, with 10% TSS, the AOAFS-HDLCP technique offered an average   of 90.59%,   of 94.89%,   of 90.59%,   of 92.59%, and an   of 90.59%.Also, with 80% TRS, the AOAFS-HDLCP model yielded an average   of 90.62%,   of 93.88%,   of 90.62%,   of 92.15%, and an   of 90.62%.At last, with 20% TSS, the AOAFS-HDLCP method accomplished an average   of 92.01%,   of 94.34%,   of 92.01%,   of 93.13%, and an   of 92.01%.The CCP outcomes of the AOAFS-HDLCP method under 90:10 and 80:20 of the TRS /TSS datasets are shown in Table 2.The simulation values demonstrate that the AOAFS-HDLCP method categorized the churn and non-churn samples effectively.With 90% TRS, the AOAFS-HDLCP model provided an average accu y of 93.58%, prec n of 96.63%, reca l of 93.58%, F score of 95.03%, and an AUC score of 93.58%.In addition, with 10% TSS, the AOAFS-HDLCP technique offered an average accu y of 90.59%, prec n of 94.89%, reca l of 90.59%, F score of 92.59%, and an AUC score of 90.59%.Also, with 80% TRS, the AOAFS-HDLCP model yielded an average accu y of 90.62%, prec n of 93.88%, reca l of 90.62%, F score of 92.15%, and an AUC score of 90.62%.At last, with 20% TSS, the AOAFS-HDLCP method accomplished an average accu y of 92.01%, prec n of 94.34%, reca l of 92.01%, F score of 93.13%, and an AUC score of 92.01%.The confusion 
matrices generated by the AOAFS-HDLCP system on 60:40 and 70:30 TRS/TSS datasets are illustrated in The CCP outcomes of the AOAFS-HDLCP system at 60:40 and 70:30 TRS/TSS datasets are shown in Table 3.The achieved outcomes indicate that the proposed AOAFS-HDLCP technique categorized the churn and non-churn samples in an effective manner.With 60% TRS, the AOAFS-HDLCP method provided an average   of 87.18%,   of 96.83%,   of 87.18%,   of 91.21%, and an   of 87.18%.In addition, with 40% TSS, the AOAFS-HDLCP method yielded an average   of 91.58%,   of 97.70%,   of 91.58%,   of 94.33%, and an   of 91.58%.Also, with 70% TRS, the AOAFS-HDLCP method produced an average  of 93.09%,  of The CCP outcomes of the AOAFS-HDLCP system at 60:40 and 70:30 TRS/TSS datasets are shown in Table 3.The achieved outcomes indicate that the proposed AOAFS-HDLCP technique categorized the churn and non-churn samples in an effective manner.With 60% TRS, the AOAFS-HDLCP method provided an average accu y of 87.18%, prec n of 96.83%, reca l of 87.18%, F score of 91.21%, and an AUC score of 87.18%.In addition, with 40% TSS, the AOAFS-HDLCP method yielded an average accu y of 91.58%, prec n of 97.70%, reca l of 91.58%, F score of 94.33%, and an AUC score of 91.58%.Also, with 70% TRS, the AOAFS-HDLCP method produced an average accu y of 93.09%, prec n of 96.64%, reca l of 93.09%, F score of 94.76%, and an AUC score of 93.08%.At last, with 30% TSS, the AOAFS-HDLCP method accomplished an average accu y of 94.65%, prec n of 96.92%, reca l of 94.65%, F score of 95.74%, and an AUC score of 94.65%.Both TR_accu y and VL_accu y outcomes of the AOAFS-HDLCP methodology for 70:30 TRS/TSS dataset are illustrated in Figure 5.The TL_accu y is evaluated by estimating the AOAFS-HDLCP system on the TR data, while VL_accu y is determined by the assessment of the proposed method using test data.The simulation values show that both TR_accu y and VL_accu y values increase with the maximum number of epochs.Hereafter, the 
effectiveness of the AOAFS-HDLCP method increases on the TR and TS data with an increase in the number of epochs.
The TR_loss and VR_loss outcomes of the AOAFS-HDLCP model under 70:30 of the TRS/TSS are shown in Figure 6.The TR_loss represents the error between the prediction performance and original values at the TR dataset.The VR_loss denotes the performance evaluation of the AOAFS-HDLCP method on the validation dataset.The simulation value demonstrates that both TR_loss and VR_loss tend to reduce with an increase in the number of epochs.This provides the superior outcome of the AOAFS-HDLCP algorithm and its ability to produce accurate classification.The minimized TR_loss and VR_loss values reveal the high efficiency of the AOAFS-HDLCP system in capturing patterns and correlations.
Table 4 shows the results of the comparison analysis conducted between the proposed AOAFS-HDLCP method and the existing methods [20,39,40]. The experimental values infer that the DR and LR models exhibited poor results, whereas the SVM, SGD, and RMSProp approaches achieved slightly increased performance.

Conclusions
In the current study, the AOAFS-HDLCP technique has been introduced for churn prediction in the telecom industry. The objective of the presented method is to predict churn so as to improve customer retention in the telecom industry. The presented technique combines the AOAFS approach, CNN-AE classification, and TEO-based hyperparameter tuning. The AOAFS is designed to choose an optimal set of features, the CNN-AE model performs the churn prediction, and the TEO technique tunes the hyperparameters of the CNN-AE system to optimize its outcomes. An extensive experimental analysis was conducted to illustrate the performance of the AOAFS-HDLCP approach. The achieved findings show that the AOAFS-HDLCP method outperforms the other techniques, with an improved accuracy of 94.65%. Future studies can focus on outlier removal and on handling class-imbalanced data.

Figure 6. Loss curve of the AOAFS-HDLCP model under 70:30 of TRS/TSS.
A PR analysis of the AOAFS-HDLCP model was conducted on the 70:30 TRS/TSS dataset, and the results are shown in Figure 7. The simulation values infer that the AOAFS-HDLCP approach produced the maximum PR values. Additionally, the AOAFS-HDLCP technique attained the maximum PR performance in all the classes.

Figure 7. PR analysis of the AOAFS-HDLCP methodology under 70:30 of TRS/TSS.
In Figure 8, the ROC analysis curve achieved by the AOAFS-HDLCP algorithm for the 70:30 TRS/TSS dataset is shown. This figure indicates that the AOAFS-HDLCP system achieved an improvement in the ROC values. The outcomes provide valuable insights into the tradeoff between the TPR and the FPR, and show the predictive outcomes of the presented technique on the classification of the different classes.
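The TPR/FPR tradeoff behind an ROC curve can be made concrete with a minimal sketch. The code below sweeps the decision threshold over a small set of hypothetical scores and labels (not the paper's predictions), collects the resulting (FPR, TPR) points, and integrates them with the trapezoidal rule to obtain the area under the curve. The helper names `roc_points` and `auc` are illustrative.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold.

    labels are 0/1 and must contain at least one sample of each class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Visit samples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ground-truth labels and predicted churn scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(labels, scores)
area = auc(curve)
print(curve)
print(area)
```

A curve that rises quickly toward the top-left corner corresponds to a high TPR at a low FPR, which is what an area close to 1 summarizes.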

Table 1. Details of the database.